notebooks/Unit 9 - Model Optimization/openvino.ipynb

{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# OpenVINO\n", "\n", "In this notebook, we will show how to use the OpenVINO toolkit to deploy deep learning models on edge devices and quantize models to reduce model size and inference latency. We will train a simple CNN model on the MNIST dataset, convert it to OpenVINO IR format, and the quantize the model to INT8 precision. We will then compare the size and performance of the quantized model with the original FP32 model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup OpenVINO\n", "\n", "First, we need to install OpenVINO, NNCF and torch" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install -q \"openvino>=2023.1.0\" torch torchvision --extra-index-url https://download.pytorch.org/whl/cpu\n", "%pip install -q \"nncf>=2.6.0\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import torch\n", "import torch.nn as nn\n", "import torch.nn.functional as F\n", "import torch.optim as optim\n", "from torchvision import datasets, transforms\n", "import pathlib\n", "import numpy as np\n", "import openvino as ov\n", "import nncf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Train Model\n", "\n", "Next, define and train a simple CNN model on the MNIST dataset" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "transform=transforms.Compose([\n", " transforms.ToTensor(),\n", " transforms.Normalize((0.1307,), (0.3081,))\n", " ])\n", "\n", "train_dataset = datasets.MNIST('./data', train=True, download=True,transform=transform)\n", "test_dataset = datasets.MNIST('./data', train=False,transform=transform)\n", "\n", "class Net(nn.Module):\n", " def __init__(self):\n", " super(Net, self).__init__()\n", " self.conv1 = nn.Conv2d(in_channels=1, out_channels=12, kernel_size=3)\n", " self.pool = nn.MaxPool2d(kernel_size=2, stride=2)\n", " self.fc = nn.Linear(12 * 13 * 13, 10)\n", "\n", " def forward(self, x):\n", " x = x.view(-1, 1, 28, 28) \n", " x = F.relu(self.conv1(x))\n", " x = self.pool(x)\n", " x = x.view(x.size(0), -1) \n", " x = self.fc(x)\n", " output = F.log_softmax(x, dim=1)\n", " return output\n", "\n", "\n", "train_loader = torch.utils.data.DataLoader(train_dataset, 32)\n", "test_loader = torch.utils.data.DataLoader(test_dataset, 32)\n", "\n", "device = \"cpu\"\n", "\n", "epochs = 1\n", "\n", "model = Net().to(device)\n", "optimizer = optim.Adam(model.parameters())\n", "\n", "model.train()\n", "\n", "for epoch in range(1, epochs+1):\n", " for batch_idx, (data, target) in enumerate(train_loader):\n", " data, target = data.to(device), target.to(device)\n", " optimizer.zero_grad()\n", " output = model(data)\n", " loss = F.nll_loss(output, target)\n", " loss.backward()\n", " optimizer.step()\n", " print('Train Epoch: {} [{}/{} ({:.0f}%)]\\tLoss: {:.6f}'.format(\n", " epoch, batch_idx * len(data), len(train_loader.dataset),\n", " 100. * batch_idx / len(train_loader), loss.item()))\n", "\n", "MODEL_DIR = pathlib.Path(\"./models\")\n", "MODEL_DIR.mkdir(exist_ok=True)\n", "torch.save(model.state_dict(), MODEL_DIR / \"original_model.p\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Convert to OpenVINO IR\n", "\n", "Then, convert the model to OpenVINO IR format" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "core = ov.Core()\n", "example_input = next(iter(test_loader))[0]\n", "ov_model = ov.convert_model(model, example_input=example_input)\n", "ov.save_model(ov_model, MODEL_DIR / f\"openvino_ir.xml\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Quantization\n", "\n", "To quantize the model using NNCF, first, create a transformation function to convert torch tensor to NumPy array and then use the created function together with a pytorch data loader to create calibration dataset using `Dataset` class from NNCF. Next, quantize the model using the `quantize` function from NNCF. Finally, compile the quantized model and save as OpenVINO IR format." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def transform_fn(data_item):\n", " images, _ = data_item\n", " return images.numpy()\n", "\n", "calibration_dataset = nncf.Dataset(train_loader, transform_fn)\n", "quantized_model = nncf.quantize(ov_model, calibration_dataset)\n", "model_int8 = ov.compile_model(quantized_model)\n", "input_fp32 = next(iter(test_loader))[0][0:1]\n", "res = model_int8(input_fp32)\n", "ov.save_model(quantized_model, MODEL_DIR / f\"quant_openvino_ir.xml\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Check Size\n", "\n", "Compare the size of the FP32 and INT8 models" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%ls -lh {MODEL_DIR}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Check Accuracy\n", "\n", "Evaluate the accuracy of the INT8 models and compare it with the FP32 model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def test_ov(model, data_loader):\n", " compiled_model = ov.compile_model(model)\n", " test_loss = 0\n", " correct = 0\n", " for data, target in data_loader:\n", " output = torch.tensor(compiled_model(data)[0])\n", " test_loss += F.nll_loss(output, target, reduction='sum').item() # sum up batch loss\n", " pred = output.argmax(dim=1, keepdim=True) # get the index of the max log-probability\n", " correct += pred.eq(target.view_as(pred)).sum().item()\n", "\n", " test_loss /= len(data_loader.dataset)\n", "\n", " return 100. * correct / len(data_loader.dataset)\n", "\n", "acc = test_ov(ov_model, test_loader)\n", "print(f\"Accuracy of original model: {acc}\")\n", "\n", "qacc = test_ov(quantized_model, test_loader)\n", "print(f\"Accuracy of quantized model: {qacc}\")" ] } ], "metadata": { "kernelspec": { "display_name": "model_optimization", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" } }, "nbformat": 4, "nbformat_minor": 2 }

notebooks/Unit 9 - Model Optimization/openvino.ipynb (241 lines of code) (raw):